1 Rationale and Research Questions

1.1 Background

In the context of escalating global environmental challenges, the shift from traditional fossil fuel-based energy sources to renewable energy has become a focal point in efforts to reduce carbon emissions and combat climate change (Kabeyi and Olanrewaju, 2022). However, the transition’s broader environmental impacts, particularly on air quality, remain less explored. These impacts are especially relevant given rising global energy demand and the need to meet that demand sustainably.

1.2 Significance

Understanding the relationship between renewable energy generation and air quality is crucial. Renewable energy sources like wind and solar are lauded for their lower environmental impact compared to fossil fuels, which are major contributors to air pollution (UCSUSA, 2018). Air pollution is a significant environmental hazard, affecting human health, ecosystems, and the climate. It is responsible for millions of premature deaths annually and contributes to the occurrence of diseases like asthma, heart disease, and lung cancer (National Geographic, 2023). Therefore, assessing how increased renewable energy generation affects air quality indicators is not just an environmental concern but a public health imperative.

1.3 Theoretical Context

The hypothesis underlying this research is that an increase in renewable energy generation leads to a reduction in air pollution. This hypothesis is grounded in the understanding that renewable energy sources, unlike fossil fuels, do not emit pollutants like sulfur dioxide, nitrogen oxides, and particulate matter during electricity generation.

1.4 Research Questions

1. What is the relationship between distributed renewable energy generation and the level of air pollution?

This question aims to investigate the correlation between the rise in renewable energy generation and the concentrations of various air pollutants. It seeks to understand whether regions with higher renewable energy output exhibit lower levels of air pollutants.

2. Among air quality indicators (PM2.5, NO2, and SO2), which display the most significant response to variations in energy generation?

This question delves deeper into identifying which specific pollutants are most responsive to changes in energy generation types. It is crucial for pinpointing the environmental benefits of renewable energy sources and for policy-making aimed at targeted air pollution reduction.

2 Dataset Information

The exploratory analysis required combining air quality and power plant datasets. Air quality data were obtained from the United States Environmental Protection Agency (EPA), while power plant data were obtained from the U.S. Energy Information Administration (EIA). Our analysis covered a period of two decades, from 2001 to 2021.

EPA has established air quality standards for six pollutants: particulate matter, ozone, sulfur dioxide, nitrogen dioxide, carbon monoxide, and lead. Among these, particulate matter (PM2.5), nitrogen dioxide (NO2), and sulfur dioxide (SO2) are mainly produced as by-products of fossil fuel combustion and pose significant threats to human health (Perera et al. 2019, Perera 2017, Bridges et al. 2015). This study therefore investigates the relationship between these three air quality indicators and energy generation. We downloaded the pollutant data, which are available from 1980 to 2021, from the EPA website (https://aqs.epa.gov/aqsweb/airdata/download_files.html#Daily). The data are reported at the city level and include a number of observation points in each county. We calculated the daily mean by state and then aggregated the daily means to obtain monthly pollutant concentrations at the state level.
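The aggregation described above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the code used in the project (the wrangling code lives in the repository). The function name `monthly_state_means`, the record layout `(state, date, concentration)`, and the two-step averaging (daily state mean first, then monthly mean of those daily means) are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def monthly_state_means(records):
    """Aggregate daily monitor readings to monthly state-level means.

    records: iterable of (state, "YYYY-MM-DD", concentration) tuples.
    This layout is hypothetical; the raw EPA files have more columns.
    """
    # Step 1: average all monitor readings for a state on a given day.
    daily = defaultdict(list)
    for state, date, value in records:
        daily[(state, date)].append(value)
    daily_means = {key: mean(vals) for key, vals in daily.items()}

    # Step 2: average the daily state means within each calendar month.
    monthly = defaultdict(list)
    for (state, date), value in daily_means.items():
        monthly[(state, date[:7])].append(value)  # "YYYY-MM" prefix
    return {key: mean(vals) for key, vals in sorted(monthly.items())}


# Made-up readings for illustration only.
sample = [
    ("CA", "2001-01-01", 10.0),
    ("CA", "2001-01-01", 14.0),
    ("CA", "2001-01-02", 8.0),
    ("TX", "2001-01-01", 6.0),
]
result = monthly_state_means(sample)
print(result)
```

Averaging by day first keeps days with many monitor readings from dominating the monthly value; averaging all readings in a month directly would be an equally plausible reading of the description and would give different numbers.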

We collected monthly and annual energy generation from power plants, along with power plant locations and capacities, from two datasets: EIA-923 (https://www.eia.gov/electricity/data/eia923/) and EIA-860 (https://www.eia.gov/electricity/data/eia860/), respectively. The EIA-923 dataset provides detailed generator data such as generation, fuel consumption, and stocks. The EIA-860 dataset lists the capacities of power plants, including the generators that are operable, as well as plant-level data for the surveyed generators, which includes their location information.

Details of the wrangling process from raw to processed data can be found in the “Code” file in the Git repository.

Table 1. Dataset Information for the Sample Period 2001 - 2021

| Dataset | Source | Variables |
|---|---|---|
| Air Quality Summary Statistics by Criteria Pollutants and Location | EPA Air Quality System (AQS) | Monthly PM2.5, SO2, and NO2 Concentrations |
| Power Plant Generator Level Capacities and Locations | EIA Form EIA-860 | Annual Installed Generation Capacity by Fuel Type |
| Power Plant Monthly Energy Generation | EIA Form EIA-923 | Monthly Net Generation by Fuel Type |

3 Exploratory Analysis

We began our exploratory analysis by examining whether and how solar and wind generating capacity, net generation, and air quality have changed over time in the contiguous United States. We first used the wrangled power plant data to visualize the changes in the total installed capacity of solar and wind plants from 2001 to 2021. Figure 1 shows that California, Texas, and Iowa hold the highest cumulative installed capacity, suggesting that these states experienced greater growth in installed capacity than other states. We also plotted the annual energy generation from solar and wind sources over the same period and observed that the states with the highest installed capacity also exhibit the largest annual energy generation from these renewable sources (Figure 2 and Figure 3).

Figure 1. Annual Installed Generation Capacity: Solar and Wind (MW)

Figure 2. Annual Solar and Wind Energy Generation over Time (TWh)

Figure 3. Monthly Solar and Wind Energy Generation over Time (TWh)

The change in installed solar and wind plants can also be visualized spatially across the contiguous United States, as illustrated below. The map shows that the number of installed solar and wind plants increased significantly over the two decades from 2001 to 2021.

Figure 4. Solar and Wind Power Plants Locations and Capacities

We utilized ggplot and gganimate to visualize the increase and distribution of plants within the states that had the highest growth in installed capacity. We thus found the top three states and created a visualization that is shown as Figure 5.

Figure 5. Solar and Wind Installed Capacity in California, Texas, Iowa

After analyzing the increasing installed capacity of solar and wind plants in the United States, with a special focus on California, Texas, and Iowa, we examined how the concentrations of the key criteria pollutants changed over time. We plotted the concentrations of SO2, NO2, and PM2.5, which are associated with fossil fuel generation, over the sample period. As the plots below show, the concentrations of these pollutants have declined steadily over the years.

Figure. Trends in Pollutant Concentrations

To investigate the trend in the measured pollutants over time, we conducted a time series analysis of the measured values of each of the three pollutants in California, Texas, and Iowa from 2001 to 2021. Our goal was to determine whether there has been a change in the recorded PM2.5, SO2, and NO2 concentrations over the sample period. The null hypothesis is that there has been no change in the recorded PM2.5, SO2 and NO2 concentrations in the three states over the sample period. The alternative hypothesis is that there has been a change in the recorded PM2.5, SO2 and NO2 concentrations in the three states over the sample period.

After decomposing the time series, we observed that each of the three pollutants has a seasonal component, as seen in the time series plots shown below. We therefore ran the seasonal Mann-Kendall (SMK) test on each series, which produced a p-value below 0.05 in every case. As a result, we reject the null hypothesis: the recorded PM2.5, SO2, and NO2 concentrations changed over the period 2001 - 2021 in California, Texas, and Iowa. The negative tau values indicate a monotonic downward trend, meaning that each pollutant decreased in each state.

Figure. Time Series Analysis for Pollutants

Table 2. Results of Seasonal Mann-Kendall test

| Trend | tau | 2-sided p-value |
|---|---|---|
| CA_PM2.5 | -0.4198413 | 0 |
| CA_SO2 | -0.7531746 | 0 |
| CA_NOX | -0.7285714 | 0 |
| TX_PM2.5 | -0.3888889 | 0 |
| TX_SO2 | -0.4880952 | 0 |
| TX_NOX | -0.7150794 | 0 |
| IA_PM2.5 | -0.4238095 | 0 |
| IA_SO2 | -0.5976190 | 0 |
| IA_NOX | -0.6715447 | 0 |
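To make the tau and p-value columns concrete, here is a minimal sketch of a seasonal Mann-Kendall test in Python using only the standard library. It is an illustration, not the routine used to produce Table 2. The function name `seasonal_mann_kendall` is hypothetical, and the sketch assumes a regularly spaced monthly series with no missing values and no tie correction. The input series is synthetic.

```python
import math
from itertools import combinations


def seasonal_mann_kendall(values, period=12):
    """Return (tau, two-sided p-value) for a regularly spaced series.

    Simplifying assumptions: no missing values, no tie correction,
    and independence between seasons.
    """
    s_total, var_total, pairs_total = 0, 0.0, 0
    for season in range(period):
        x = values[season::period]  # e.g. every January, every February...
        n = len(x)
        if n < 2:
            continue
        # Mann-Kendall S within one season: sum of signs of (later - earlier).
        s_total += sum((b > a) - (b < a) for a, b in combinations(x, 2))
        var_total += n * (n - 1) * (2 * n + 5) / 18
        pairs_total += n * (n - 1) // 2
    if pairs_total == 0:
        raise ValueError("need at least two observations in some season")
    tau = s_total / pairs_total
    # Continuity-corrected normal approximation for the summed S statistic.
    if s_total > 0:
        z = (s_total - 1) / math.sqrt(var_total)
    elif s_total < 0:
        z = (s_total + 1) / math.sqrt(var_total)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return tau, p_value


# Synthetic five-year monthly series: seasonal cycle plus a steady decline.
series = [20 - 0.1 * t + 3 * math.sin(2 * math.pi * t / 12) for t in range(60)]
tau, p_value = seasonal_mann_kendall(series)
print(tau, p_value)
```

Because the test compares each month only with the same month in other years, the seasonal cycle does not mask the underlying trend. A tau near -1 means almost every later observation is lower than the earlier ones for the same month.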

4 Analysis

4.1 Question 1: What is the relationship between distributed renewable energy generation and the level of air pollution?

We can formulate a null and alternative hypothesis for the above research question as follows:

H0: There is no change in recorded air quality with an increase in renewable energy generation in the states of California, Texas and Iowa over the period 2001 - 2021.
Ha: There is a change in recorded air quality with an increase in renewable energy generation in the states of California, Texas and Iowa over the period 2001 - 2021.

To evaluate this hypothesis, we generated plots of mean pollutant concentrations measured against net monthly solar and wind generation for all three states.

Figure 4.1: Mean Pollutant Concentrations Measured against Net Monthly Solar and Wind Generation

The figures above suggest an inverse (negative) correlation between the measured levels of pollutants and net generation from solar and wind sources: as the amount of energy generated from these sources increases, the concentrations of the three criteria pollutants decrease. To investigate this further, we fitted a simple linear regression of the mean concentration of each pollutant on net energy generation. We recognize that other factors, such as population growth, economic activity, and policy changes, also affect air quality and may bias the estimates. These factors are not modeled explicitly; their influence is absorbed into the error term, so the coefficients should be read as associations rather than causal effects. The results of the simple linear regression analysis are reported in Table 3.

Table 3. Results of Simple Linear Regression Analysis

| State | Pollutant | Beta (Net Generation) | 95% CI | p-value | R² | AIC | σ |
|---|---|---|---|---|---|---|---|
| California | PM2.5 | -0.77 | -1.1, -0.46 | <0.001 | 0.088 | 1,375 | 3.68 |
| California | SO2 | -0.21 | -0.24, -0.18 | <0.001 | 0.445 | 189 | 0.349 |
| California | NO2 | -1.8 | -2.1, -1.6 | <0.001 | 0.464 | 1,255 | 2.89 |
| Texas | PM2.5 | -0.28 | -0.35, -0.20 | <0.001 | 0.177 | 1,000 | 1.75 |
| Texas | SO2 | -0.09 | -0.12, -0.07 | <0.001 | 0.213 | 401 | 0.532 |
| Texas | NO2 | -0.47 | -0.57, -0.37 | <0.001 | 0.255 | 1,148 | 2.34 |
| Iowa | PM2.5 | -1.3 | -1.6, -0.98 | <0.001 | 0.214 | 1,161 | 2.41 |
| Iowa | SO2 | -0.64 | -0.74, -0.55 | <0.001 | 0.423 | 563 | 0.733 |
| Iowa | NO2 | -1.5 | -1.8, -1.2 | <0.001 | 0.352 | 1,053 | 1.99 |

CI = Confidence Interval
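For readers who want to see how the slope, R-squared, and σ columns relate to one another, the following is a minimal sketch of a simple linear regression in Python using only the standard library. It is not the code that produced the table. The function name `simple_ols` and the numbers in the example are made up for illustration, and σ is taken here to be the residual standard error with n - 2 degrees of freedom, which is an assumption about how the table defines it.

```python
import math
from statistics import mean


def simple_ols(x, y):
    """Fit y = intercept + beta * x by ordinary least squares."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    intercept = y_bar - beta * x_bar
    # Residual and total sums of squares give R-squared and sigma.
    rss = sum((yi - (intercept + beta * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return {
        "beta": beta,
        "intercept": intercept,
        "r_squared": 1 - rss / tss,
        "sigma": math.sqrt(rss / (n - 2)),  # residual standard error
    }


# Made-up monthly values: net generation (x) and a pollutant mean (y).
generation = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
pollutant = [12.1, 11.0, 10.8, 9.5, 9.9, 8.2]
fit = simple_ols(generation, pollutant)
print(fit)
```

A negative `beta` corresponds to the negative Beta values in the table, and `r_squared` is the share of variance in the pollutant series that the single predictor accounts for.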

PM2.5 and Renewable Generation

A unit increase in renewable generation is associated with a 0.77, 0.28, and 1.3 unit decrease in PM2.5 concentration in California, Texas, and Iowa, respectively. At a significance level of 0.05, the p-values fall below the threshold, so the results are statistically significant and indicate a significant negative correlation between PM2.5 concentration and renewable energy generation. Based on the R-squared values, renewable energy generation explains 8.8%, 17.7%, and 21.4% of the total variance in PM2.5 concentration in California, Texas, and Iowa, respectively.

SO2 and Renewable Generation

A unit increase in renewable generation is associated with a 0.21, 0.09, and 0.64 unit decrease in SO2 concentration in California, Texas, and Iowa, respectively. At a significance level of 0.05, the p-values fall below the threshold, so the results are statistically significant and indicate a significant negative correlation between SO2 concentration and renewable energy generation. Based on the R-squared values, renewable energy generation explains 44.5%, 21.3%, and 42.3% of the total variance in SO2 concentration in California, Texas, and Iowa, respectively.

NO2 and Renewable Generation

A unit increase in renewable generation is associated with a 1.8, 0.47, and 1.5 unit decrease in NO2 concentration in California, Texas, and Iowa, respectively. At a significance level of 0.05, the p-values fall below the threshold, so the results are statistically significant and indicate a significant negative correlation between NO2 concentration and renewable energy generation. Based on the R-squared values, renewable energy generation explains 46.4%, 25.5%, and 35.2% of the total variance in NO2 concentration in California, Texas, and Iowa, respectively.

In conclusion, these observations suggest that increased renewable energy generation is associated with lower measured concentrations of PM2.5, SO2, and NO2, pollutants typically associated with the combustion of fossil fuels.

4.2 Question 2: Among three air quality indicators (PM2.5, NO2, and SO2), which display the most significant response to variations in energy generation?

We can formulate a null and alternative hypothesis for the above research question as follows:

H0: An increase in renewable energy generation has had a uniform impact on all three air quality indicators in the states of California, Texas and Iowa over the period 2001 - 2021.
Ha: An increase in renewable energy generation has not had a uniform impact on all three air quality indicators in the states of California, Texas and Iowa over the period 2001 - 2021.

To evaluate the above research question, we plotted the distribution of pollutant concentrations over time, as shown in the box plots below.

Figure 4.2: Distribution of Pollutant Levels across States

Figure 4.3: Distribution of Pollutant Levels across States

Figure 4.4: Distribution of Pollutant Levels across States

PM2.5 Levels

The distribution of PM2.5 levels varies widely between states. California shows a particularly high range of PM2.5 concentrations with notable outliers, indicating episodes of very poor air quality. It’s important to look into the reasons for California’s variability, such as wildfires or urban pollution.

NO2 Levels

NO2 levels are somewhat variable, with California showing a higher median concentration. This might be associated with industrial activities or high traffic density.

SO2 Levels

Iowa has historically had a higher range of SO2 concentrations, which have fluctuated over time and trended downwards. This pollutant is often associated with industrial processes and the burning of sulfur-containing fuels such as coal. The coal mining and energy generation activity in Iowa could explain the higher SO2 concentrations measured there relative to California and Texas.
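The quantities read off the box plots above (median, quartile range, and outliers) can be computed directly. The sketch below uses Python's standard library. The function name `box_stats` and the sample values are hypothetical, and outliers are defined by the common 1.5 × IQR rule, which is an assumption about how the plots flag them.

```python
from statistics import quantiles


def box_stats(values):
    """Summarize a sample the way a box plot does."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # whisker limits
    outliers = [v for v in values if v < low or v > high]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}


# Made-up monthly PM2.5 means with one extreme month (e.g. a smoke episode).
pm25 = [8, 9, 10, 11, 12, 40]
stats = box_stats(pm25)
print(stats)
```

Comparing `iqr` and the number of `outliers` across states and pollutants gives a numeric counterpart to the visual comparison in Figures 4.2 to 4.4.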

5 Summary and Conclusions

According to our analysis, renewable energy generation in the United States has shown a general upward trend over time, indicating increased adoption and capacity. California (CA) has been leading the pack with a significantly higher generation, especially with a steep increase around 2020. Other states have also shown growth in renewable energy generation but to varying degrees, with Texas (TX) and Iowa (IA) exhibiting notable increases. The variability in generation over time could be influenced by factors such as state policies, technological advancements, and investment in renewable energy infrastructure. The overall increasing trend is in line with global efforts to transition to cleaner energy sources to reduce reliance on fossil fuels and combat climate change.

Furthermore, our analysis suggests that the measured value of pollutants has a negative correlation or inverse relationship with net generation from solar and wind in the three states. This means that the higher the amount of energy generated from wind and solar, the lower the amount of the three criteria pollutants. This is a promising indication that the use of renewable energy sources has a positive impact on reducing harmful emissions and promoting a cleaner environment. However, it’s important to acknowledge that there may be other factors at play that can affect air quality. Further analysis is needed to better understand the interrelationship between renewable energy generation and other factors such as transportation and industrial activities. Nonetheless, our findings provide valuable insights into the potential benefits of adopting renewable energy sources in reducing harmful emissions and improving air quality.

6 References

Air Pollution. https://education.nationalgeographic.org/resource/air-pollution. Accessed 6 Dec. 2023.

Bridges, A., Felder, F. A., McKelvey, K., & Niyogi, I. (2015). Uncertainty in energy planning: Estimating the health impacts of air pollution from fossil fuel electricity generation. Energy Research & Social Science, 6, 74-77.

Environmental Impacts of Renewable Energy Technologies | Union of Concerned Scientists. https://www.ucsusa.org/resources/environmental-impacts-renewable-energy-technologies. Accessed 6 Dec. 2023.

Kabeyi, Moses Jeremiah Barasa, and Oludolapo Akanni Olanrewaju. ‘Sustainable Energy Transition for Renewable and Low Carbon Grid Electricity Generation and Supply’. Frontiers in Energy Research, vol. 9, 2022. Frontiers, https://www.frontiersin.org/articles/10.3389/fenrg.2021.743114.

Perera FP. 2017. Multiple threats to child health from fossil fuel combustion: Impacts of air pollution and climate change. Environmental Health Perspectives 125:141-148.

Perera, F., Ashrafi, A., Kinney, P., & Mills, D. (2019). Towards a fuller assessment of benefits to children’s health of reducing air pollution and mitigating climate change due to fossil fuel combustion. Environmental research, 172, 55-72.